Surrey-CVSSP system for DCASE2017 challenge task 4

Authors

  • Yong Xu
  • Qiuqiang Kong
  • Wenwu Wang
  • Mark D. Plumbley

Abstract

In this technical report, we present a set of methods for Task 4 of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge. This task evaluates systems for the large-scale detection of sound events using weakly labeled training data. The data are YouTube video excerpts focusing on transportation and warning sounds, chosen for their industrial applications. There are two subtasks: audio tagging and sound event detection from weakly labeled data. A convolutional neural network (CNN) and a gated recurrent unit (GRU) based recurrent neural network (RNN) are adopted as our basic framework. We propose a learnable gating activation function for selecting informative local features. An attention-based scheme is used to localize specific events in a weakly supervised mode. A new batch-level balancing strategy is also proposed to tackle the data imbalance problem. Fusion of posteriors from different systems is found to be effective in improving performance. In summary, we obtain a 61% F-value for the audio tagging subtask and an error rate (ER) of 0.73 for the sound event detection subtask on the development set, whereas the official multilayer perceptron (MLP) based baseline obtained only a 13.1% F-value for audio tagging and an ER of 1.02 for sound event detection. Finally, we ranked first in the audio tagging subtask on the evaluation set, and second as a team in the sound event detection subtask.
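As a rough illustration of three ideas named in the abstract (a learnable gate, attention-based pooling of frame-level predictions into a clip-level tag, and fusion of posteriors), the following NumPy sketch may help. It is not the authors' implementation: the function names, the random weights, the layer sizes, and the choice of simple averaging for fusion are all assumptions made for illustration, and the CNN and GRU layers are replaced by plain matrix products.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_linear_unit(x, w_lin, w_gate):
    """Multiply a linear branch by a learnable sigmoid gate (hypothetical helper)."""
    return (x @ w_lin) * sigmoid(x @ w_gate)

def attention_pool(frame_probs, attention_logits):
    """Pool frame-level class probabilities (T, K) into clip-level ones (K,).

    The attention weights are a softmax over the time axis, computed per class,
    so frames with high weight indicate where an event is likely located.
    """
    z = attention_logits - attention_logits.max(axis=0, keepdims=True)
    weights = np.exp(z)
    weights /= weights.sum(axis=0, keepdims=True)
    clip_probs = (weights * frame_probs).sum(axis=0)
    return clip_probs, weights

def fuse_posteriors(posteriors):
    """Fuse several systems by averaging their posteriors (an assumed fusion rule)."""
    return np.mean(np.stack(posteriors, axis=0), axis=0)

# Toy sizes, chosen arbitrarily: T frames, D input features, H hidden units, K classes.
T, D, H, K = 240, 64, 32, 5
rng = np.random.default_rng(0)

features = rng.standard_normal((T, D))
hidden = gated_linear_unit(
    features,
    0.1 * rng.standard_normal((D, H)),
    0.1 * rng.standard_normal((D, H)),
)
frame_probs = sigmoid(hidden @ (0.1 * rng.standard_normal((H, K))))
attention_logits = hidden @ (0.1 * rng.standard_normal((H, K)))

clip_probs, weights = attention_pool(frame_probs, attention_logits)

# Pretend a second system produced slightly different clip-level posteriors.
other_system = np.clip(clip_probs + 0.05, 0.0, 1.0)
fused = fuse_posteriors([clip_probs, other_system])
```

In this sketch only clip-level labels would be needed to train the tagger, since the loss is applied to `clip_probs`; the per-frame `weights` and `frame_probs` are what a detection system could threshold to estimate event onsets and offsets.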

Journal:
  • CoRR

Volume: abs/1709.00551

Publication date: 2017